fix(mito2): schema-safe skipping index pruning#8122
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces logic to handle column type changes in bloom filter indexing by verifying type compatibility between region metadata and SST metadata. It adds an SstApplyPlan to manage predicates and ensures that indexes are only applied when types match, preventing issues after an ALTER TABLE operation. Review feedback suggests optimizing the SstApplyPlan by excluding columns that are entirely missing from the SST to improve cache key consistency and reduce unnecessary map entries.
fdd3f00 to
59c4cd2
Compare
|
@codex review |
|
Codex Review: Didn't find any major issues. Chef's kiss. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements a compatibility verification for bloom filter predicates against SST metadata, ensuring that indexes are only applied when column types match. Key changes include the addition of compatible_predicate_for_sst to the BloomFilterIndexApplier and its integration into the Parquet row group pruning process. Review feedback identifies an opportunity to optimize this check by excluding columns missing from the SST, which would enhance caching efficiency and prevent redundant processing during index application.
There was a problem hiding this comment.
Pull request overview
This PR makes mito2 skipping-index pruning safer across schema evolution by comparing predicate column types against each SST’s stored metadata before applying bloom-filter pruning.
Changes:
- Adds per-SST compatible predicate selection for bloom-filter index pruning.
- Updates reader cache-key generation and apply calls to use the compatible predicate subset.
- Adds regression coverage for changing a skipping-indexed column type.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
src/mito2/src/sst/index/bloom_filter/applier.rs |
Adds compatible predicate filtering and updates bloom-filter apply flow. |
src/mito2/src/sst/index/bloom_filter/applier/builder.rs |
Tracks expected predicate column types from region metadata. |
src/mito2/src/sst/parquet/reader.rs |
Passes SST metadata into bloom pruning and computes per-SST cache keys. |
src/mito2/src/sst/parquet.rs |
Updates tests for the new bloom predicate key flow. |
tests/cases/standalone/common/alter/change_col_type_skipping_index.sql / .result |
Adds SQL regression test for skipping-index behavior after column type changes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Made some minor refactors, ptal @evenyag |
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
#8074 but skipping index.
What's changed and what's your intention?
This pr mainly change
PR Checklist
Please convert it to a draft if some of the following conditions are not met.